GAMs with integrated model selection using penalized regression splines and applications to environmental modelling

نویسندگان

  • Simon N. Wood
  • Nicole H. Augustin
چکیده

Generalized Additive Models (GAMs) have been popularized by the work of Hastie and Tibshirani (1990) and the availability of user friendly GAM software in Splus. However, whilst it is flexible and efficient, the GAM framework based on backfitting with linear smoothers presents some difficulties when it comes to model selection and inference. On the other hand, the mathematically elegant work of Wahba (1990) and co-workers on Generalized Spline Smoothing (GSS) provides a rigorous framework for model selection (Gu and Wahba, 1991) and inference with GAMs constructed from smoothing splines: but unfortunately these models are computationally very expensive with operations counts that are of cubic order in the number of data. A “middle way” between these approaches is to construct GAMs using penalized regression splines (see e.g. Wahba 1980, 1990; Eilers and Marx 1998, Wood 2000). In this paper we develop this idea and show how GAMs constructed using penalized regression splines can be used to get most of the practical benefits of GSS models, including well founded model selection and multi-dimensional smooth terms, with the ease of use and low computational cost of backfit GAMs. Inference with the resulting methods also requires slightly fewer approximations than are employed in the GAM modelling software provided in Splus. This paper presents the basic mathematical and numerical approach to GAMs implemented in the R package mgcv, and includes two environmental examples using the methods as implemented in the package.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Thin plate regression splines

I discuss the production of low rank smoothers for d ≥ 1 dimensional data, which can be fitted by regression or penalized regression methods. The smoothers are constructed by a simple transformation and truncation of the basis that arises from the solution of the thinplate spline smoothing problem, and are optimal in the sense that the truncation is designed to result in the minimum possible pe...

متن کامل

Generalized additive models in business and economics

The paper presents applications of a class of semi-parametric models called generalized additive models (GAMs) to several business and economic datasets. Applications include analysis of wage-education relationship, brand choice, and number of trips to a doctor’s office. The dependent variable may be continuous, categorical or count. These semiparametric models are flexible and robust extension...

متن کامل

On confidence intervals for GAMs based on penalized regression splines

Generalized additive models represented using penalized regression splines, estimated by penalized likelihood maximisation and with smoothness selected by generalized cross validation or similar criteria, provide a computationally efficient general framework for practical smooth modelling. Various authors have proposed approximate Bayesian interval estimates for such models, based on extensions...

متن کامل

A penalized framework for distributed lag non-linear models.

Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAM). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag struc...

متن کامل

Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron, et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman, et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002